 Pretoria


'Kill the people': How men were left to starve in a South African gold mine

Al Jazeera

Ayanda Ndabeni watched the faint glow from his headlamp fight the vast darkness 1,500 metres (4,920 feet) below ground. His miner's lamp had lasted for more than a week after he was lowered down into the shaft of the gold mine. But now the batteries were dying. He gently flipped the plastic switch of his lamp, turning it off, and the trapped men around him became shadows. In the stifling heat and humidity, their anxiety pressed in from all sides. Ayanda had descended into Shaft 10 of the Buffelsfontein mine in late September 2024, lowered by a team of nearly 20 men operating ropes and a pulley above ground. That day, he'd spotted police vehicles near the mine's entrance. The 36-year-old assumed it was just routine patrols around the mine system, which is 2km (1.2 miles) deep. But then the rope pulley, via which food, water, batteries and other items arrived, stopped moving. The shouting that usually indicated the rope operators were sending down a man or supplies also fell silent. When huge rocks came crashing down the shaft, they knew it was a warning. The men whispered of their growing fears that something was very wrong on the surface. Patrick Ntsokolo was also in Shaft 10. He was a few hundred metres higher up than Ayanda and had arrived in late July. Patrick was new to the mines. Tasked by the leaders of the artisanal miners with collecting the food, water and alcohol lowered down by the rope pulley, he hauled supplies along the slippery tunnels to small shops.


Towards a data-scale independent regulariser for robust sparse identification of non-linear dynamics

Raut, Jay, Wilke, Daniel N., Schmidt, Stephan

arXiv.org Machine Learning

Data normalisation, a common and often necessary preprocessing step in engineering and scientific applications, can severely distort the discovery of governing equations by magnitude-based sparse regression methods. This issue is particularly acute for the Sparse Identification of Nonlinear Dynamics (SINDy) framework, where the core assumption of sparsity is undermined by the interaction between data scaling and measurement noise. The resulting discovered models can be dense, uninterpretable, and physically incorrect. To address this critical vulnerability, we introduce the Sequential Thresholding of Coefficient of Variation (STCV), a novel, computationally efficient sparse regression algorithm that is inherently robust to data scaling. STCV replaces conventional magnitude-based thresholding with a dimensionless statistical metric, the Coefficient Presence (CP), which assesses the statistical validity and consistency of candidate terms in the model library. This shift from magnitude to statistical significance makes the discovery process invariant to arbitrary data scaling. Through comprehensive benchmarking on canonical dynamical systems and practical engineering problems, including a physical mass-spring-damper experiment, we demonstrate that STCV consistently and significantly outperforms standard Sequential Thresholding Least Squares (STLSQ) and Ensemble-SINDy (E-SINDy) on normalised, noisy datasets. The results show that STCV-based methods can successfully identify the correct, sparse physical laws even when other methods fail. By mitigating the distorting effects of normalisation, STCV makes sparse system identification a more reliable and automated tool for real-world applications, thereby enhancing model interpretability and trustworthiness.



Zero-Shot Transfer with Deictic Object-Oriented Representation in Reinforcement Learning

Ofir Marom, Benjamin Rosman

Neural Information Processing Systems

Object-oriented representations in reinforcement learning have shown promise in transfer learning, with previous research introducing a propositional object-oriented framework that has provably efficient learning bounds with respect to sample complexity.


Drone strikes in Ethiopia's Tigray kill one amid fears of renewed conflict

Al Jazeera

One person has been killed and another injured in drone strikes in Ethiopia's northern Tigray region, a senior Tigrayan official and a humanitarian worker said, in another sign of renewed conflict between regional and federal forces. The Tigrayan official on Saturday said the drone strikes hit two Isuzu trucks near Enticho and Gendebta, two places in Tigray about 20km (12 miles) apart. A local humanitarian worker confirmed the strikes had happened. Both asked not to be named, the Reuters news agency reported. It was not immediately clear what the trucks were carrying.


Sudan air force bombing of towns, markets and schools has killed hundreds, report says

BBC News

Sudan's air force has carried out bombings in which at least 1,700 civilians have died in attacks on residential neighbourhoods, markets, schools and camps for displaced people, according to an investigation into air raids in the country's civil war. The Sudan Witness Project says it has compiled the largest known dataset of military airstrikes in the conflict, which began in April 2023. Its analysis indicates that the air force has used unguided bombs in populated areas. The data focuses on attacks by warplanes, which only the Sudanese Armed Forces (SAF) is capable of operating. Its rival, the paramilitary Rapid Support Forces (RSF), does not have aircraft.


Hallucination reduction with CASAL: Contrastive Activation Steering For Amortized Learning

Yang, Wannan, Qiu, Xinchi, Yu, Lei, Zhang, Yuchen, Yang, Aobo, Kokhlikyan, Narine, Cancedda, Nicola, Garcia-Olano, Diego

arXiv.org Artificial Intelligence

Large Language Models (LLMs) exhibit impressive capabilities but often hallucinate, confidently providing incorrect answers instead of admitting ignorance. Prior work has shown that models encode linear representations of their own knowledge and that activation steering can reduce hallucinations. These approaches, however, require real-time monitoring and intervention during inference. We introduce Contrastive Activation Steering for Amortized Learning (CASAL), an efficient algorithm that connects interpretability with amortized optimization. CASAL directly bakes the benefits of activation steering into the model's weights. Once trained, LLMs answer questions they know while abstaining from answering those they do not. CASAL's lightweight design requires training only a submodule of a single transformer layer and yet reduces hallucination by 30%-40% across multiple short-form QA benchmarks. CASAL is 30x more compute-efficient and 20x more data-efficient than strong LoRA-based baselines such as SFT and DPO, boosting its practical applicability in data-scarce domains. Importantly, CASAL also generalizes effectively to out-of-distribution (OOD) domains. We showcase CASAL's flexibility in mitigating hallucinations in both text-only and vision-language models. To our knowledge, CASAL is the first steering-based training method that has been shown to be effective for both dense and Mixture-of-Experts (MoE) models. CASAL represents a promising step forward for applying interpretability-inspired methods for practical deployment in production systems.


Mean-Field Limits for Two-Layer Neural Networks Trained with Consensus-Based Optimization

De Deyn, William, Herty, Michael, Samaey, Giovanni

arXiv.org Artificial Intelligence

Artificial Intelligence has witnessed remarkable progress over the past decades, both in its capabilities and its range of applications. Today, neural networks are present in a variety of fields. One classical application is function approximation, which is supported by the universal approximation theory [34]. In computer vision, convolutional neural networks form the backbone of most modern architectures [39, 38], while the framework of neural ordinary differential equations has contributed significantly to optimal control problems [17, 10]. In natural language processing and speech recognition, recurrent neural networks and their long short-term memory variants have yielded significant performance improvements [33, 51]. More recently, diffusion models have proven to be powerful generative models, with applications ranging from image denoising to video generation [56]. Neural networks have even found their way into scientific computing. The most notable example is physics-informed neural networks, which are capable of solving both forward and inverse problems governed by partial differential equations [50]. A neural network can be viewed, in general, as a function parametrized by a set of weights and biases, which we collectively refer to as parameters.


TriLex: A Framework for Multilingual Sentiment Analysis in Low-Resource South African Languages

Nkongolo, Mike, Vorster, Hilton, Warren, Josh, Naick, Trevor, Vanmali, Deandre, Mashapha, Masana, Brand, Luke, Fernandes, Alyssa, Calitz, Janco, Makhoba, Sibusiso

arXiv.org Artificial Intelligence

Low-resource African languages remain underrepresented in sentiment analysis research, resulting in limited lexical resources and reduced model performance in multilingual applications. This gap restricts equitable access to Natural Language Processing (NLP) technologies and hinders downstream tasks such as public-health monitoring, digital governance, and financial inclusion. To address this challenge, this paper introduces TriLex, a three-stage retrieval-augmented framework that integrates corpus-based extraction, cross-lingual mapping, and Retrieval-Augmented Generation (RAG) driven lexicon refinement for scalable sentiment lexicon expansion in low-resource languages. Using an expanded lexicon, we evaluate two leading African language models (AfroXLMR and AfriBERTa) across multiple case studies. Results show that AfroXLMR consistently achieves the strongest performance, with F1-scores exceeding 80% for isiXhosa and isiZulu, aligning with previously reported ranges (71-75%), and demonstrating high multilingual stability with narrow confidence intervals. AfriBERTa, despite lacking pre-training on the target languages, attains moderate but reliable F1-scores around 64%, confirming its effectiveness under constrained computational settings. Comparative analysis shows that both models outperform traditional machine learning baselines, while ensemble evaluation combining AfroXLMR variants indicates complementary improvements in precision and overall stability. These findings confirm that the TriLex framework, together with AfroXLMR and AfriBERTa, provides a robust and scalable approach for sentiment lexicon development and multilingual sentiment analysis in low-resource South African languages.


Data Flows and Colonial Regimes in Africa: A Critical Analysis of the Colonial Futurities Embedded in AI Ecosystems

Ndaka, A., Avila-Acosta, F., Mbula-Ndaka, H., Amera, C., Chauke, S., Majiwa, E.

arXiv.org Artificial Intelligence

Data Flows and Colonial Regimes in Africa: A Critical Analysis of the Colonial Futurities Embedded in AI Recommendation Algorithms. Angella Ndaka, University of Witwatersrand, Johannesburg, South Africa; Fátima Ávila-Acosta, Berlin Graduate School of Social Sciences at Humboldt University, Berlin, Germany; Harnred Mbula, Centre for Epistemic Justice, Nairobi, Kenya; Christine Amera, Centre for Epistemic Justice, Nairobi, Kenya; Sandra Tiyani Chauke, University of Pretoria, South Africa; Eucabeth Majiwa, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya. Abstract: In the last few years, Africa has experienced growth in a thriving ecosystem of Artificial Intelligence (AI) technologies and systems, developed and promoted by both local and global technology players. While the sociotechnical imaginaries about these systems promote AI as critical to achieving Africa's sustainable development agenda, some of them have subtly permeated society, recreating new values, cultures, practices, and histories that threaten to marginalize minority groups in the region. Africa predominantly frames AI as an imaginary solution to address complex social challenges; however, the narrative subtly ignores deeper power-related concerns, including data governance, embedded algorithmic colonialism, and the exploitation that propagates new digital colonial sites. However, the development of current AI ethics in Africa is in its infancy and predominantly framed through lenses of Western perspective, with the social and ethical impacts of the AI innovations and application on African epistemologies and worldviews not prioritized. To ensure that people on the African continent leverage the benefits of AI, these social and ethical impacts of AI need to be critically and explicitly considered and addressed.
This chapter will therefore seek to frame the elemental and invisible problems of AI and big data in the African context by examining digital sites and infrastructure through the lens of power and interests. It will present reflections on how these sites are using AI recommendation algorithms to recreate new digital societies in the region, how they have the potential to propagate algorithmic colonialism and negative gender norms, and what this means for the regional sustainable development agenda. The chapter proposes adopting business models that embrace response-ability and consider the existence of alternative socio-material worlds of AI. These reflections will mainly come from ongoing discussions with Kenyan social media users in this author's user space talks, which take place every month. Keywords: Artificial Intelligence; algorithmic colonialism; Data; response-ability; digital sites. Section 1: Introduction. The growing global interest, combined with rising investments in AI skilling and infrastructure development, is a key driver of the expanding landscape of AI technologies and systems across Africa.